ABSTRACT

There are four steps in obtaining test data, each
requiring planning and decision-making: assembling students,
administering tests, scoring, and recording scores. In assembling
students, two cautions are in order. First, for norm-referenced
tests, students in the treatment group should take the test at the
same time of the year as the students in the norming sample. Second,
high absenteeism, differences among test administrators, and
differences in testing environments may significantly affect test
scores. Publisher's manuals should be followed exactly to ensure
consistency between treatment and comparison groups. From a research
viewpoint, trained test administrators should be used rather than
classroom teachers. Scoring decisions include selecting an answer
form (machine-scorable booklets vs. separate answer sheets); a
scoring agent (school personnel, test publisher, or independent test
scoring company); and costs of scoring and of additional statistical
analysis. Finally, when recording scores, data forms should be
carefully proofread. All scores should be completely identified and
arranged to facilitate analysis. Eleven suggestions for designing
test results forms discuss page numbers, testing dates, group
identification, confidentiality, test name, use of a single data
sheet; arrangement and format of names, identification numbers;
number of entries, and arrangement of columns. (CP)

Once an evaluation design and an appropriate
achievement test are chosen, the most crucial
step in the evaluation process is the collection
of accurate, complete data. Analysis of the data
may be a more technically complex step, but at
least, when analysis errors are discovered, they
can usually be corrected. On the other hand, if
data are distorted or missing, no amount of
analysis can adequately correct the problem. If there
are too many flaws in the raw data, the entire
evaluation becomes meaningless.

There are four steps in obtaining test data,
each requiring planning and decisions: (a) assembling
the students, (b) administering the
tests, (c) scoring the tests, and (d) recording
the scores.

ASSEMBLING STUDENTS FOR TESTING 

This step, often passed over lightly, is important
for two reasons. First, of course, the
time of day and the place where students are
assembled may affect test scores. The date of
testing may also be important. In the norm-referenced
model, for example, it is critical
that students in the treatment group take the
test at the same time of year as the students
in the norming sample. Second, any changes in
the way the test is given that are made between
the pretest and posttest may significantly affect
test scores. A change such as testing students
in their classrooms rather than in a large assembly
hall may or may not make a difference in
scores, but the only way to be safe is to use
exactly the same procedures each time. Changing
from independently administered pretests to posttests
administered by classroom teachers because
the money ran out (or vice versa because money
was left over) is an example of a practice which
should be carefully avoided. Careful planning
could avoid all such problems.

It is difficult to generalize about rules for
assembling students because of the wide differences
among schools. Most important is to minimize
the disruption to the students while ensuring
that all treatment and comparison students can
take both pre- and posttests under similar testing
conditions. The major problems in achieving
this goal are high absenteeism, differences among
test administrators, and differences in testing
environments. Where the evaluation simply involves
testing project students in their regular
project setting, few problems should be encountered.
On the other hand, the situation may be
more complicated if control students are involved,
or if students are to be tested before the project
begins or after it ends. Under these circumstances,
it is well worth the effort to lay out
in detail the number of different tests or test
levels to be used, the number of test locations,
the time for each test, the number of make-up
sessions, the number of special test administrators
or supervisors, and so on. Testing often
turns out to be a bigger project than anticipated,
and, if resources are limited, it is better to
simplify both the pretest and posttest rather
than to expend so much effort on the pretest
that posttesting cannot be accomplished in an
adequate fashion.
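
The planning details listed above can be written down as one record per testing session, so that pretest and posttest arrangements can be compared item by item. The following Python sketch is illustrative only: the `SessionPlan` fields and the `differences` helper are assumptions made for this example, not part of the procedure described in the text.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SessionPlan:
    """Hypothetical record of how one testing session is to be run."""
    test_level: str
    location: str         # e.g. "classroom" or "assembly hall"
    administrator: str    # e.g. "trained administrator" or "classroom teacher"
    minutes_allowed: int
    makeup_sessions: int


def differences(pretest: SessionPlan, posttest: SessionPlan) -> list[str]:
    """Return the names of the fields whose values differ between the two plans."""
    return [
        f.name
        for f in fields(SessionPlan)
        if getattr(pretest, f.name) != getattr(posttest, f.name)
    ]


# Hypothetical plans: the posttest changes both the room and the administrator.
pre = SessionPlan("Level B", "classroom", "trained administrator", 45, 2)
post = SessionPlan("Level B", "assembly hall", "classroom teacher", 45, 2)
print(differences(pre, post))
```

An empty list means the two sessions are planned identically on every recorded item; any name in the list marks a procedural change to resolve before testing begins.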



ADMINISTERING THE TESTS 

Ensuring Consistency 

It goes without saying that test administration
should be orderly, and that cheating and
other irregularities are not permissible. But
orderliness is not enough. For the purposes of
evaluation it is necessary to have consistency.
There are two kinds of consistency to worry about,
depending on whether a norm-referenced or
comparison-group evaluation design is used. If a
norm-referenced design is used, the critical thing is
to be sure that the test publisher's procedures
are followed exactly. This specifically includes
reading instructions, answering questions, doing
practice problems, and timing each section.

When a comparison group is used, it is still
advisable to follow the publisher's instructions
to the letter in order to make supplementary
norm-referenced comparisons possible. The most
critical thing, however, is to maintain close
similarity between treatment- and comparison-group
testing situations. The simplest way to ensure
comparable situations is to test both treatment
and comparison students as a single group.
Usually, however, in either norm-referenced or
comparison-group designs it will be necessary to test
several groups. Then special steps must be taken
to make sure that they are tested under as similar
conditions as possible. Even here, there are many
possible problems; for example, bringing
comparison-group pupils into an unfamiliar project lab
for testing may put them at a disadvantage.

Training Test Administrators 

There are two basic ways of making test
administrations comparable. One is to use a few trained
test administrators to test all the groups. The
other is to train the regular teachers to give the
tests to their own students. The latter alternative
is much less desirable from a research viewpoint.
If teachers must be used, it is advisable
to have them test each others' classes to minimize
possible biases.

Simply telling teachers or other test administrators
to look over the test manual is never
adequate, however experienced they
are. Each test administrator should be
impressed with the importance of following
procedures exactly. Each one should at least have
"walked through" the entire process, from handing
out pencils to collecting the tests, before ever
administering the test in an evaluation. Where
teacher judgments are involved in scoring student
responses (as in oral reading tests), much more
training is required.

SCORING THE TESTS

The most important scoring requirement is
accuracy, but there are trade-offs of time and money
to consider. The important variables are what
type of answer form to use and who does the
scoring.

Selecting an Answer Form

Most of the major tests can be purchased with
machine-scorable booklets or separate answer
sheets. Some non-standardized tests may be
available only in hand-scored versions. The main
factor in choosing among answer forms is the age of
the students. Separate answer sheets are usually
much easier to process, but young children tend
to score lower on these forms, presumably because
the forms are confusing to them. In general,
separate answer sheets are suitable for above-average
fourth graders and all older students. Younger
children should use machine-scorable or
hand-scored booklets (Harcourt Brace Jovanovich, Inc.,
1973).

Selecting a Scoring Procedure

Whichever type of form is used, there are
three basic ways of having the test scored.
Scoring can be done by: (a) local school personnel,
(b) the publisher of the test, or (c) an
independent test scoring company. A choice between the
test publisher or an independent company will
depend on the local situation and the test that
is chosen. Cost, turnaround time, and types and
quality of service may vary; shopping around is
in order. The major decision, however, is whether
to have the scoring done by either type of service
or simply by available school personnel. The
major advantages of a good scoring service are
accuracy and the variety of analyses provided by
computer processing. The major disadvantages are
[text illegible in the source].


Cost Considerations

"Ballpark" cost figures for machine-scored
forms (taken from one widely used publisher's
service) range from $[illegible] to $[illegible] per pupil, depending
on the type of form and length of the test battery.
Hand-scored booklets cost three or four times as
much to score, although a lower original purchase
price will offset this difference slightly.
Clearly, local personnel can do scoring at
lower cost, but included in a service's
price are a number of features that are costly,
time consuming, and prone to error when scoring
is done by hand. These include: (a) reports with
convenient formats in triplicate for each group
(e.g., class), completely identified as to school,
date, group, etc.; (b) raw scores, percentile
scores (local or national distributions), standard
scores, and, in some instances, NCEs for each
student on each subtest; (c) mean standard scores
for each group. Several other analyses are
available for prices ranging from an additional $.95
to $.12 per student for each analysis. These
include score distributions for each class, item
analyses, and individual student profiles.
Additional statistical analyses are readily available,
or, for schools with access to their own computer
facilities, the scores are available from the
publisher on computer cards or tape.

In short, for very small tryouts with simple
analyses it may be desirable to do the entire job
locally. Unless local computer facilities are
available, however, more extensive evaluations
may well be completed more accurately, thoroughly,
and economically with the help of a scoring
service. All the major services have literature and
consultants to provide details and to assist in
planning the scoring and analysis.
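
The trade-off between a scoring service and local scoring comes down to simple per-pupil arithmetic. The sketch below shows one way to set up that comparison. Every figure in it (rates, wage, minutes per test) is a hypothetical placeholder, not a price from any publisher, and the function names are invented for this example. Amounts are kept in whole cents to avoid rounding surprises.

```python
def service_cost_cents(n_students, base_rate_cents, analysis_rates_cents=()):
    """Service cost: a per-pupil base rate plus a per-pupil rate for each extra analysis."""
    per_pupil = base_rate_cents + sum(analysis_rates_cents)
    return n_students * per_pupil


def local_cost_cents(n_students, minutes_per_test, hourly_wage_cents):
    """Labor cost of hand scoring only; proofreading and report preparation are not included."""
    return n_students * minutes_per_test * hourly_wage_cents / 60


# Hypothetical inputs, chosen only to show the arithmetic.
n = 200
service = service_cost_cents(n, base_rate_cents=50, analysis_rates_cents=(10, 10))
local = local_cost_cents(n, minutes_per_test=3, hourly_wage_cents=600)
print(f"service: ${service / 100:.2f}   local labor: ${local / 100:.2f}")
```

The local figure covers scoring labor alone. A fair comparison would also add the time needed to check the hand scoring and to produce the group reports and score conversions that a service includes in its price.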

RECORDING THE SCORES 

Recording the scores is the final step in the
data collection process, but to ensure that the
scores will be usable, the details of recording
should be worked out well before pretest time.
If you use a commercial scoring service, you may
have little control over the recording process.
If you decide to do your own scoring, or if you
want to transfer scores from computer printouts
to a more convenient form, you must consider two
important issues: accuracy of the data, and
details of the data-recording forms.

Copying scores accurately onto data forms is
not a complicated problem for small-scale local
studies, but the possibility of errors must not be
overlooked. Even the most conscientious recorders
make errors. All data forms should be carefully
proofread, preferably with one person reading
aloud while a second person checks the scores.
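
Where scores are also kept in machine-readable form, the two-person check has a rough analogue: two people key the same scores independently and the copies are compared. This is a substitute technique offered for illustration, not the read-aloud procedure itself. The function name and the student I.D.s below are hypothetical.

```python
def find_mismatches(first_copy, second_copy):
    """Compare two independently keyed copies of the same score list.

    Each copy maps a student I.D. to a recorded score. An I.D. missing
    from a copy is treated the same as a blank (None) entry.
    Returns (student_id, first_value, second_value) for every disagreement,
    sorted by student I.D.
    """
    mismatches = []
    for student_id in sorted(set(first_copy) | set(second_copy)):
        a = first_copy.get(student_id)
        b = second_copy.get(student_id)
        if a != b:
            mismatches.append((student_id, a, b))
    return mismatches


# Hypothetical data: the second recorder transposed the digits of one score.
first = {"1001": 34, "1002": 41, "1003": 27}
second = {"1001": 34, "1002": 14, "1003": 27}
print(find_mismatches(first, second))
```

Each reported disagreement still has to be resolved against the original answer document; the comparison only shows where to look.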



The details of the data forms might appear to
be of little importance, but, in many school
districts, the way in which data have been recorded
virtually precludes any reasonable analyses. Two
general principles must be observed when deciding
upon a standard data format. First, all scores
must be completely identified, and second, scores
must be arranged in a way that facilitates
analysis.

Considerations for Data-Recording Forms

The following considerations should be incor- 
porated into any data-recording form and are il- 
lustrated by the building level worksheet forms 
that accompany each model. 

• Page numbers. Most sets of scores require
  more than one page. A page number at the
  top should identify each sheet, and the "number
  of pages" helps make sure no pages are
  missing.

• Testing dates. Test dates are critical,
  especially in norm-referenced evaluations. Record
  the date of the original testing and make-up
  testing sessions for both the pretest and
  posttest.

• Group identification. Identify clearly the
  group for which data are recorded near the
  top of the page to simplify the retrieval of
  that group's data from a large data base.

• Provision for anonymity. Arrange the page
  so that it can be photocopied without the
  student's name. This permits possible later use
  of the data for research purposes without
  compromising student privacy.

• Test name. It simplifies analysis greatly
  to have only one test (pre and post) recorded
  on each sheet, provided the rules for listing
  students suggested below are followed. List
  the complete name of the pretest and posttest
  (taken exactly from the test booklets and
  including publication date).

• Single pre/post data sheet. Identifying
  students and organizing their names efficiently
  are the most difficult problems in recording
  student data. Where evaluations are only for
  one year and are based on fall and spring
  testing, the problems can be solved with a
  little effort and care. But where students
  must be followed over several years, the
  problems are more difficult since students come
  and go from projects, and groups are reorganized
  every year. The simplest rule is to make
  sure that the posttest scores are all entered
  on the same sheet of paper as the corresponding
  pretest scores. This at least eliminates
  the problem of trying to find each student's
  name on two lists.

• Standard order of names. A second rule for
  listing student names is to establish a standard
  ordering of the names, and stick to it
  for the life of the evaluation and for all
  tests that are used. If a student moves or
  fails to take some of the tests, then the
  appropriate entries are blank, but he should
  not be eliminated from the list. If new
  students enter the program, their names should
  be added to the end of the lists for all
  tests, even those for which no data will be
  entered. Besides a reduction in confusion,
  there are some practical advantages to this
  procedure. For example, a master form can be
  prepared with only the students' names and
  identification numbers filled in, and the
  forms can simply be duplicated when new tests
  are given. It also makes comparisons or
  correlations between any two sets of scores
  relatively easy because any two forms can be laid
  side by side and the corresponding names will
  line up correctly. If there is a compelling
  reason to change the order of student names
  in the middle of a project, then either change
  all forms, or maintain a double set of forms
  (old and new order).

• Standard form for names. Establish a rule
  for recording names. "Caldwell, D.E." should
  never become "Danny Caldwell" on a second list.
  The simplest procedure is to allow plenty of
  space and to spell out first names and include
  middle initials (e.g., Caldwell, Daniel E.).

• I.D. numbers. If I.D. numbers are used, each
  student should have an I.D. number that
  identifies him completely. For example, different
  digits might identify the student either as a
  member of the project group or a control group,
  indicate class or sex, and, of course,
  represent the individual student. In some
  evaluations, other codes (including letters) can
  be used, but careful planning is necessary
  in order to permit any desired grouping simply
  by I.D. number.

• Uniform number of entries. A page should have
  some reasonable number of entries (e.g., 20,
  25, 30), and the number should not vary from
  page to page.

• Pre/post score columns. Keep pre- and posttest
  scores in adjacent columns. For example,
  enter the raw scores for pretest and posttest
  in two columns, percentile scores for each in
  the next two columns, etc., instead of pairing
  each pretest raw score with its standard score,
  percentile score, etc., followed by each posttest
  score and its transformations. This
  greatly simplifies the mechanics of analysis;
  comparisons are nearly always made between
  pre- and posttest scores of the same type.
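
Several of the considerations above translate directly into a record layout. The Python sketch below is one possible arrangement, offered as an illustration only. The six-digit I.D. scheme (one digit for group, two for class, three for the student), the field names, and the second student's name are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


def make_id(group: int, class_no: int, student_no: int) -> str:
    """Build a hypothetical 6-digit I.D.: 1 digit group, 2 digits class, 3 digits student."""
    return f"{group:1d}{class_no:02d}{student_no:03d}"


def parse_id(student_id: str) -> dict:
    """Split a 6-digit I.D. back into its parts so rows can be grouped by I.D. alone."""
    return {
        "group": int(student_id[0]),       # assumed coding: 1 = project, 2 = control
        "class_no": int(student_id[1:3]),
        "student_no": int(student_id[3:6]),
    }


@dataclass
class Row:
    """One line of the data sheet; pre/post scores of the same type sit side by side."""
    student_id: str
    name: str                       # always "Last, First M."
    pre_raw: Optional[int] = None   # None stands for a blank entry
    post_raw: Optional[int] = None
    pre_pct: Optional[int] = None
    post_pct: Optional[int] = None


def anonymized(roster):
    """Return the rows without names, for research use of the data."""
    return [(r.student_id, r.pre_raw, r.post_raw, r.pre_pct, r.post_pct) for r in roster]


# Master roster in a fixed order; a new student is appended, never inserted.
roster = [Row(make_id(1, 3, 7), "Caldwell, Daniel E.", 31, 38, 42, 55)]
roster.append(Row(make_id(2, 3, 12), "Ortiz, Maria L."))  # no scores yet: entries stay blank
print(anonymized(roster))
```

Because the roster order never changes and blank entries are kept, any two score sheets built from the same master roster line up row for row.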